Math 70 Homework 2, Michael Downs


1. I show that minimizing $\sum_{i=1}^{n} \|x_i - b - a\hat{c}_i\|^2$ with $x_i, b, a \in \mathbb{R}^n$ and $\|a\|^2 = 1$ is the same as maximizing the variance of $Za$. The problem here is finding an $n$-dimensional best-fit line, with direction vector $a$ and displacement vector $b$, to the $n$-dimensional points $x_i$. To do that we have to find $a$, $b$, and each $\hat{c}_i$ that minimize the residual error from each $x_i$ to the line. Each $\hat{c}_i$ represents the projection of the point $x_i$ onto the line, a scalar in $\mathbb{R}^1$. I first find $b$ by holding $a$ and each $\hat{c}_i$ fixed and letting $w_i = x_i - a\hat{c}_i$. We then wish to minimize $\sum_{i=1}^{n} \|w_i - b\|^2$ with respect to $b$:

$$\frac{\partial}{\partial b} \sum_{i=1}^{n} \|w_i - b\|^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b} \|w_i - b\|^2 = \sum_{i=1}^{n} -2(w_i - b).$$

Setting this equal to $0$ and solving for $b$:

$$0 = \sum_{i=1}^{n} (w_i - b) \quad\Longrightarrow\quad \sum_{i=1}^{n} w_i = nb \quad\Longrightarrow\quad b = \frac{1}{n} \sum_{i=1}^{n} w_i.$$

So $b = \frac{1}{n}\sum_{i=1}^{n} w_i = \bar{w} = \bar{x} - a\bar{c}$. Let $z_i = x_i - \bar{x}$ and $\hat{d}_i = \hat{c}_i - \bar{c}$. We must now find $\hat{d}_i$ such that each $\|z_i - a\hat{d}_i\|^2$ is minimized. Since $a'a = 1$,

$$\|z_i - a\hat{d}_i\|^2 = (z_i - a\hat{d}_i)'(z_i - a\hat{d}_i) = z_i'z_i - 2z_i'a\hat{d}_i + \hat{d}_i^2.$$

Taking the derivative,

$$\frac{\partial}{\partial \hat{d}_i} \left( z_i'z_i - 2z_i'a\hat{d}_i + \hat{d}_i^2 \right) = -2z_i'a + 2\hat{d}_i.$$

Setting this equal to $0$ and solving, we get $\hat{d}_i = z_i'a$. Returning to the problem of minimizing $\sum_{i=1}^{n} \|z_i - a\hat{d}_i\|^2 = \sum_{i=1}^{n} \|z_i - a(z_i'a)\|^2$:

$$\sum_{i=1}^{n} \|z_i - a(z_i'a)\|^2 = \sum_{i=1}^{n} \left( z_i'z_i - 2z_i'a(z_i'a) + (z_i'a)^2 \right) = \sum_{i=1}^{n} \left( z_i'z_i - (z_i'a)^2 \right) = \sum_{i=1}^{n} z_i'z_i - \sum_{i=1}^{n} (z_i'a)^2.$$

$\sum_{i=1}^{n} (z_i'a)^2$ is a sum of nonnegative terms, so in order to minimize $\sum_{i=1}^{n} \|z_i - a(z_i'a)\|^2$ we have to maximize $\sum_{i=1}^{n} (z_i'a)^2 = \sum_{i=1}^{n} \left( (x_i - \bar{x})'a \right)^2$, which is the same as maximizing the variance of $Za$.

2. Let $V$ be the matrix whose columns are the (normalized) eigenvectors of $Z'Z$, ordered by decreasing eigenvalue. Then the first column of $ZV$ is the projection vector $Za$. Rewriting $Z$ via its singular value decomposition $Z = U\Sigma V'$ and using the fact that $V$ is orthogonal gives

$$ZV = U\Sigma V'V = U\Sigma.$$

The product $U\Sigma$ scales each column of $U$ by the singular value in the corresponding column of $\Sigma$. Thus $Za$ is proportional to the first column of $U$.
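As a quick numerical illustration of this claim (a sketch on random data, not part of the submitted script), one can check in R that $Za$ and the first column of $U$ returned by svd agree up to a constant factor:

set.seed(1)
Z  = scale(matrix(rnorm(200), nrow = 40, ncol = 5), center = TRUE, scale = FALSE)  # toy centered data
a  = eigen(t(Z) %*% Z, symmetric = TRUE)$vectors[, 1]   # first eigenvector of Z'Z
u1 = svd(Z)$u[, 1]                                      # first column of U
print(range((Z %*% a) / u1))   # entrywise ratio is (numerically) constant, up to sign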

3. Code used:

X = read.csv("/Users/Michael/Desktop/R/stud.perform.csv", header = T)  # read in data
n = nrow(X); m = ncol(X)
X = as.matrix(X[1:n, 2:m])  # get rid of first column, which numbers the students

# compute Z
m = as.matrix(colMeans(X))
one = as.matrix(rep(1, n))
Z = X - one %*% t(m)
tzz = t(Z) %*% Z

# get a
EG = eigen(tzz, symmetric = T)
a = EG$vectors[, 1]
val = EG$values
proj = as.vector(X %*% a)

# plot
plot(proj, rep(1, n), xlab = "PCA projections", ylab = "")
title("Projection on the maximum eigenvector")

# get 5 best
which(proj %in% sort(proj)[1:5])

# quality of projection onto line:
val[1] / sum(val)

which outputs (the proper order should be ):

> hw2(3)
[1] "The five best students: "
[1]
[1] "Quality of projection onto line: "
[1]

and the plot:

[Plot: "Projection on the maximum eigenvector", x-axis "PCA projections".]
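As an informal cross-check (a sketch, not part of the submitted script, assuming the objects X, a, and val created above), R's built-in prcomp should reproduce the same leading direction and the same quality-of-projection ratio:

pc = prcomp(X, center = TRUE, scale. = FALSE)
print(head(pc$rotation[, 1]))          # should match a up to sign
print(pc$sdev[1]^2 / sum(pc$sdev^2))   # same ratio as val[1] / sum(val)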

4. Code:

X = as.vector(scan("/Users/Michael/Desktop/R/comwebhits.dat"))
plot.ecdf(X, do.points = FALSE, verticals = TRUE, xlab = "Time of the website hit, hour", ylab = "Probability, cdf")

# Get the median
med = X[round(length(X) / 2)]

# check
f = ecdf(X)
print("median: ")
print(med)
print("ecdf(med): ")
print(f(med))

# proportion of hits after 8 pm (20 in military time):
print("portion of hits after 8 pm: ")
print(1 - f(20))

which outputs:

> hw2(4)
Read 100 items
[1] "median: "
[1]
[1] "ecdf(med): "
[1] 0.5
[1] "portion of hits after 8 pm: "
[1] 0.06

and the plot:

[Plot: empirical cdf of the hit times, x-axis "Time of the website hit, hour", y-axis "Probability, cdf".]
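A quick sanity check of these numbers with base R (a sketch, not part of the submitted script, assuming the same comwebhits.dat file):

X = as.vector(scan("/Users/Michael/Desktop/R/comwebhits.dat"))
print(median(X))     # should agree (approximately) with the reported median
print(mean(X > 20))  # proportion of hits strictly after 8 pm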

5. Using R to draw n observations from U(0, 1). Code (this is a segment from the main hw2 file; the parameter n is passed into that function):

# n sorted observations from U(0, 1)
obs = sort(runif(n))

# plot empirical cdf
plot.ecdf(obs, do.points = FALSE, verticals = TRUE, xlab = "x", ylab = "Probability", col = "red", main = paste("ecdf(x) vs cdf(x) for", toString(n), "observations"))

# plot the theoretical cdf
x = seq(from = -.2, to = 1.2, by = .01)
lines(x, punif(x), "l", col = "green")

legend("bottomright", c("empirical", "theoretical"), lty = 1, col = c("red", "green"), bty = "n", cex = .75)

Plotting n = 10 and n = 1000:

> par(mfrow=c(1,2))
> hw2(5,10)
> hw2(5,1000)

outputs:

[Side-by-side plots: "ecdf(x) vs cdf(x) for 10 observations" and "ecdf(x) vs cdf(x) for 1000 observations", each with x-axis "x", y-axis "Probability", and a legend distinguishing the empirical and theoretical cdfs.]

Entire hw2.r:

hw2 = function(problem = 3, n = 1000)
{
  dump("/Users/Michael/Desktop/R/hw2.r")
  if (problem == 3)
  {
    X = read.csv("/Users/Michael/Desktop/R/stud.perform.csv", header = T)  # read in data
    n = nrow(X); m = ncol(X)
    X = as.matrix(X[1:n, 2:m])  # get rid of first column, which numbers the students

    # compute Z
    m = as.matrix(colMeans(X))
    one = as.matrix(rep(1, n))
    Z = X - one %*% t(m)
    tzz = t(Z) %*% Z

    # get a
    EG = eigen(tzz, symmetric = T)
    a = EG$vectors[, 1]
    val = EG$values
    proj = as.vector(X %*% a)

    # plot
    plot(proj, rep(1, n), xlab = "PCA projections", ylab = "")
    title("Projection on the maximum eigenvector")

    # get 5 best
    print("The five best students: ")
    print(which(proj %in% sort(proj)[1:5]))

    # quality of projection onto line:
    print("Quality of projection onto line: ")
    print(val[1] / sum(val))
  }
  if (problem == 4)
  {
    X = as.vector(scan("/Users/Michael/Desktop/R/comwebhits.dat"))
    plot.ecdf(X, do.points = FALSE, verticals = TRUE, xlab = "Time of the website hit, hour", ylab = "Probability, cdf")

    # Get the median
    med = X[round(length(X) / 2)]

    # check
    f = ecdf(X)
    print("median: ")
    print(med)
    print("ecdf(med): ")
    print(f(med))

    # proportion of hits after 8 pm (20 in military time):
    print("portion of hits after 8 pm: ")
    print(1 - f(20))
  }
  if (problem == 5)
  {
    # n sorted observations from U(0, 1)
    obs = sort(runif(n))

    # plot empirical cdf
    plot.ecdf(obs, xlim = c(-0.2, 1.2), ylim = c(0, 1), do.points = FALSE, verticals = TRUE, xlab = "x", ylab = "Probability", col = "red", main = paste("ecdf(x) vs cdf(x) for", toString(n), "observations"))

    # plot the theoretical cdf
    x = seq(from = -.2, to = 1.2, by = .01)
    lines(x, punif(x), "l", col = "green")

    legend("bottomright", c("empirical", "theoretical"), lty = 1, col = c("red", "green"), bty = "n", cex = .75)
  }
}
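As an optional follow-up to problem 5 (an illustrative sketch, not part of hw2.r), the largest gap between the empirical and theoretical cdfs can be computed directly, and it shrinks noticeably from n = 10 to n = 1000:

set.seed(1)
for (n in c(10, 1000)) {
  Fn = ecdf(runif(n))                 # empirical cdf of n draws from U(0, 1)
  grid = seq(0, 1, by = 0.001)
  print(c(n = n, max.gap = max(abs(Fn(grid) - punif(grid)))))  # approximate sup-distance
}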
